Least recently used replacement level generating apparatus

ABSTRACT

A least recently used replacement level generator is constructed to include n number of register stages connected in tandem. A comparison circuit associated with each stage except the last stage compares the contents of that stage with an input level value which is to be loaded into the input stage. In the absence of an identical comparison, each stage generates a shift enable signal which is passed on to the next succeeding stage. An identical comparison inhibits the generation of the shift enable signal. Therefore, when a clock signal is applied to the device, the register stages, in the presence of a control signal, cause the input level to be loaded into the input stage while the level contents of the register stages are simultaneously shifted through successive stages including the register stage whose contents are identical to the input level under the control of the shift enable signal. The contents of the output register stage accurately and instantaneously define the least recently used replacement level for use by a cache memory.

RELATED PATENT APPLICATIONS

1. Patent application entitled, "Paged Virtual Cache System", bearing Ser. No. 06/811,044, filed on even date and assigned to the same assignee as named herein, now allowed.

BACKGROUND OF THE INVENTION

1. Field of Use

The present invention relates to apparatus used by a cache memory and more particularly to apparatus for replacing information within the locations of such cache memory.

2. Prior Art

It is well known to interpose a cache memory between a central processing unit and main memory. Such arrangements improve the performance of the processing unit by providing fast access to instructions and data stored in the cache memory. During normal operation, when the instructions or data requested by the processing unit are not stored in cache, the block containing the requested information is fetched from main memory. When the cache memory is filled, new blocks replace old blocks resident therein.

While different arrangements may be used to select old blocks of information, a least recently used (LRU) replacement has been one of the most commonly used schemes employed in data and instruction cache units. These units include cache memories and address directory circuits. The memories are organized into a number of levels for storing information in the form of data and instructions for fast access. The directory circuits contain address information for identifying which blocks of instructions and data are stored in the cache memory levels. Generally, the LRU replacement scheme has been implemented using a round robin counter or first in first out (FIFO) array. In such arrangements, the assignment of a group or block of locations is made sequentially.

While such arrangements have been generally easy to implement, they are unable to provide any accurate record of order of block usage. To overcome these disadvantages, one system employs a memory for storing a number of least recently used bits to represent the order of usage of memory locations. This system is disclosed in U.S. Pat. No. 4,334,289.

When implemented as an array, the updating of entries can be time-consuming, particularly when there are a large number of cache level entries. Moreover, the delays in updating reduce cache system performance and result in only an approximation of least recently used replacement.

Accordingly, it is a primary object of the present invention to provide apparatus for replacing information within a memory on a least recently used basis.

It is a further object of the present invention to provide apparatus capable of generating signals for rapid and accurate assignment of levels on a least recently used basis.

SUMMARY OF THE INVENTION

The above and other objects of the present invention are achieved in an apparatus which is organized into n number of register stages connected in tandem. A comparison circuit associated with each register stage except the last stage compares the contents of that stage with an input level value provided by a cache memory which is to be loaded into the input stage. In the absence of an identical comparison, each stage generates a shift enable signal which is passed on to the next succeeding stage. An identical comparison inhibits the generation of the shift enable signal.

Therefore, when a clock signal is applied to the device, the register stages, in the presence of a replacement control signal, cause the input level to be loaded into the input stage while the level contents of the register stages are simultaneously shifted through successive stages including the register stage whose contents are identical to the input level. Such shifting takes place under the control of the shift enable signal. The contents of the output register stage can be used by the cache memory to accurately and instantaneously define the least recently used replacement level into which information blocks may be stored.

By keeping a record of the status of all of the assignable levels (i.e., the most recently used, the next most recently used, etc., the last most recently used) and being able to switch these values instantaneously, the device can accurately and rapidly provide an output specifying the least recently used level. In that sense, the device operates as a shifting content addressable memory (CAM).
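
Viewed purely as bookkeeping, the record of level usage described above behaves like a move-to-front list. The following Python sketch is a behavioural illustration only, under that assumption; the function name `touch` and the list representation are hypothetical and do not appear in the specification, which describes hardware register stages.

```python
def touch(order, level):
    """Return the usage order after `level` has been referenced.

    order[0] models the input stage (most recently used level) and
    order[-1] models the output stage (least recently used level).
    """
    updated = [lvl for lvl in order if lvl != level]  # drop the old position
    updated.insert(0, level)                          # re-enter at the input end
    return updated


order = [3, 2, 1, 0]       # hypothetical 4-level device after initialization
order = touch(order, 1)    # a hit on level 1
print(order, "LRU =", order[-1])   # prints [1, 3, 2, 0] LRU = 0
```

Only the entries ahead of the matching level move, which corresponds to shifting up to and including the stage that holds the identical value.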

In the preferred embodiment, the cache memory can take the form of the memories disclosed in the copending patent application entitled, "Paged Virtual Cache System", bearing Ser. No. 06/811,044, filed on even date and assigned to the same assignee as named herein. The cache memory applies a page level number to the input register stage of the device along with a hit signal. When so applied, the device operates to generate the required page number level value for assigning pages on a least recently used basis.

Since the shifting of register contents is done in parallel, the only delays are those of the series of gates which pass the shift enable signal through successive register stages. When constructed using VLSI technology, the delays are minimal. The organization of the apparatus of the present invention due to its repetitive stages is very well suited to VLSI technology.

In addition to high speed and ease of construction, the apparatus of the present invention can be readily expanded to accommodate increases in the number of levels. This can be accomplished with minimal increase in added circuits.

The novel features which are believed to be characteristic of the invention both as to its organization and method of operation, together with further objects and advantages will be better understood from the following description when considered in connection with the accompanying drawings. It is to be expressly understood, however, that each of the drawings is given for the purpose of illustration and description only and is not intended as a definition of the limits of the present invention.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates in block form a cache memory which includes the apparatus of the present invention.

FIGS. 2 and 3 are block diagrams showing in greater detail, the apparatus of the present invention.

FIG. 4 is a diagram illustrating the organization of the apparatus of the present invention.

FIG. 5 is a flow diagram used in explaining the operation of the apparatus of the present invention.

DESCRIPTION OF THE PREFERRED EMBODIMENT

As seen from FIG. 1, the cache memory 10 which includes the replacement level generator 10-6 of the present invention includes cache address circuits 10-2, cache control circuits 10-4 and the interface circuits of block 10-5. The cache address circuits 10-2 store address information specifying where the pages and blocks of data words or instructions are stored in a cache page random access memory (RAM) not shown. These circuits, in response to a request for instructions or data, operate to generate a page hit signal and page level number value upon detecting that the requested information resides in the page RAM. As shown, these signals are applied as inputs to the interface circuits of block 10-5.

The cache control circuits 10-4 generate the required control and clock signals for processing requests. These signals are distributed to different parts of cache memory 10 including the interface circuits 10-5 and generator 10-6. As shown, these signals include an initialize signal generated when cache memory 10 is powered up, a clear signal for resetting cache memory registers, an increment signal, and a clock signal which establishes overall cache timing. For further details concerning the circuits of blocks 10-2 and 10-4, reference may be made to the referenced related copending patent application.

As shown, the interface circuits 10-5 include an initialize counter 10-50 and a pair of two input data selector circuits 10-52 and 10-54 which are connected as shown. The clear signal clears the contents of counter 10-50 to ZEROS while the increment signal is used to enable the counter 10-50 to increment its contents by one in response to each clock signal.

In response to an initialize signal applied to the input select terminals of selector circuits 10-52 and 10-54, both circuits are conditioned to select the output of initialize counter 10-50 and the initialize signal as outputs. In the absence of an initialize signal, the selector circuits 10-52 and 10-54, respectively, select the page level number value and page hit signal as outputs.

The output of selector circuit 10-52 applies a page level number value as a first input to generator 10-6. The output of selector circuit 10-54 applies a shift enable signal as a second input to generator 10-6. The generator 10-6 receives as further inputs, a clear signal and a clock signal from the cache control circuits 10-4. The generator 10-6 provides as an output, a least recently used level number value as an input to the cache address circuits 10-2.
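
The behaviour of the two selector circuits can be pictured as a pair of two-input multiplexers sharing one select line. The Python sketch below is an illustrative model only; the function name `interface_select` and its parameter names are hypothetical, and signals are represented as plain integers rather than logic levels.

```python
def interface_select(initialize, counter_value, page_level, page_hit):
    """Model selector circuits 10-52 and 10-54.

    Returns (level_in, shift_enable), the two inputs applied to the
    generator.
    """
    if initialize:
        # 10-52 passes the initialize counter output; 10-54 passes the
        # initialize signal itself, which is a binary ONE.
        return counter_value, 1
    # Normal operation: page level number and page hit signal pass through.
    return page_level, page_hit


print(interface_select(1, 5, 29, 0))   # prints (5, 1)
print(interface_select(0, 5, 29, 1))   # prints (29, 1)
```

Because the initialize signal doubles as the shift enable, the generator shifts on every clock during initialization regardless of the page hit signal.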

REPLACEMENT LEVEL GENERATOR 10-6

As shown in FIG. 2, generator 10-6 includes 32 register stages 10-60 through 10-92. Each stage includes a level register (e.g. 10-600). Each stage except the last also includes a comparison circuit (e.g. 10-602) and an AND gate (e.g. 10-604). As shown, each comparison circuit receives as one input, the level number value applied to input register stage 10-60 by selector circuit 10-52. As a second input, each comparison circuit receives the output level number contents of its stage register (e.g. 10-600).

The input enable signal is applied as an input to level register 10-600 and as one input of the AND gate 10-604 of the input stage which receives as a second input, the unequal (≠) output from comparison circuit 10-602. AND gate 10-604 provides a shift enable output which connects to an enable input terminal of the next stage register 10-620 and to one input of the AND gate 10-624 of the next register stage 10-62. This AND gate receives the unequal (≠) output from the comparison circuit 10-622 of that stage. As seen from FIG. 2, the remaining register stages are similarly connected in tandem.
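
The tandem connection of comparison circuits and AND gates can be sketched as a ripple of one enable bit from the input stage toward the output stage. The following Python model is illustrative only; `shift_enables` and its arguments are hypothetical names, and the list `regs` is assumed to hold the stage register contents with the input stage first.

```python
def shift_enables(regs, level_in, enable_in):
    """Return the enable seen by each stage register, input stage first."""
    enables = []
    enable = bool(enable_in)
    for contents in regs:
        enables.append(enable)
        # AND gate of this stage: the enable is passed on only while the
        # comparison circuit reports "unequal". The value computed for the
        # last stage is unused, since that stage has no gate.
        enable = enable and (contents != level_in)
    return enables


print(shift_enables([3, 2, 1, 0], 2, True))   # prints [True, True, False, False]
```

The stage holding the matching value is still enabled, so it accepts the contents of the preceding stage, while every stage after it is left unchanged.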

Additionally, the level register of each register stage, except the input register stage 10-60, receives clear and clock signals from the cache control circuits 10-4 and the output from the previous stage. As seen from FIG. 3, the clear and clock signals are applied to the clear (CLR) and clock (CLK) input terminals of each stage level register. FIG. 3 shows the different elements of the register stages in greater detail. For the purpose of the present invention, these elements can be constructed from well known integrated circuits or implemented in CMOS or MOS technology. For further information, reference may be made to the text entitled, "MOS Integrated Circuits," prepared by the Engineering Staff of American Micro-systems, Inc., Copyright 1972.

DESCRIPTION OF OPERATION

With reference to FIGS. 1 through 4, the operation of the generator 10-6 will now be described in connection with the flow chart of FIG. 5. As seen from block 500 of FIG. 5, as part of an initialization sequence, the cache control circuits 10-4 force the clear signal to a binary ONE. This clears the initialize counter 10-50 and all of the stage registers of generator 10-6 to binary ZEROS. Next, the cache control circuits 10-4 set an internal loop counter to a count equal to N where N has the binary value of "11111". As seen from block 502, the cache control circuits 10-4 force the initialize signal to a binary ONE. This conditions the selector circuits 10-52 and 10-54 to select as outputs, the contents of initialize counter 10-50 and the binary ONE initialize signal.

As seen from FIG. 3, this causes the binary value of "00000" from initialize counter 10-50 to be applied to the input of stage register 10-600. This value is loaded into register 10-600 in response to a first clock signal. Since the inputs applied to comparison circuit 10-602 are equal (i.e., both all ZEROS), no further transfers of levels take place between register stages.

As seen from block 504, cache control circuits 10-4 test the loop counter contents. Since its contents do not equal zero, the circuits 10-4 decrement the loop counter by one and increment the initialize counter 10-50 by one. The blocks 504 and 506 are repeated until the loop count has been decremented to ZERO. At that time the register stages 0 through 31 store the 32 binary values 00000 through 11111 as shown in FIG. 4.

The initialize sequence is completed when the cache control circuits 10-4 reset the initialize signal to ZERO. When the initialize signal is a ZERO, selector circuits 10-52 and 10-54 select the page level number value and page hit signals as inputs to generator 10-6.
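
The initialization sequence can be traced with a small software model. The Python sketch below is a simulation under stated assumptions, not a description of the actual circuits: `clock` and `initialize` are hypothetical names, the list `regs` holds the stage register contents with the input stage first, and one call to `clock` stands for one clock signal with the shift enable input held at ONE.

```python
def clock(regs, level_in, enable_in):
    """Apply one clock signal and return the new stage register contents."""
    new = list(regs)
    enable = bool(enable_in)
    for i, contents in enumerate(regs):
        if not enable:
            break                                    # remaining stages keep their contents
        new[i] = level_in if i == 0 else regs[i - 1]  # load input or previous stage
        enable = contents != level_in                # comparator gates the next stage
    return new


def initialize(n_stages=32):
    regs = [0] * n_stages              # clear signal: all stage registers to ZEROS
    counter = 0                        # initialize counter 10-50 cleared
    loop = n_stages - 1                # loop counter N ("11111" for 32 stages)
    regs = clock(regs, counter, True)  # first clock loads "00000" into the input stage
    while loop != 0:                   # test, decrement, increment, clock
        loop -= 1
        counter += 1
        regs = clock(regs, counter, True)
    return regs


regs = initialize()
print(regs[0], regs[1], regs[-1])      # prints 31 30 0
```

In this model the input stage ends up holding 11111 and the output stage 00000, matching the register contents assumed in the example that follows.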

The generator 10-6 remains in an initialized state until the cache address circuits 10-2, in response to a request for data or instructions, detect a hit indicating that the requested information is stored in cache memory 10. At that time, the cache address circuits 10-2 force the page hit signal to a binary ONE and the page level number to a value designating the most recently used level value (i.e., where the hit occurred).

As seen from block 510, when a hit is detected, this causes the comparison circuit of each register stage to compare the input page level number value to the level contents of its stage register. When they are not equal, the comparison circuit conditions the AND gate of that stage to pass a shift enable signal onto the next register stage as indicated in block 518. When the comparison circuit detects that the inputs are equal, it inhibits the same AND gate from passing the shift enable signal onto the next register (block 516).

As seen from block 520, when the shift enable signal from the preceding register stage is received by a register stage, this causes the contents of the register of the preceding stage to be loaded into the register of that stage in response to a clock signal. At the same time, the most recently used level value is loaded into register 10-600 of input stage 10-60. Thus, the transfer of levels through the register stages occurs instantaneously. The result is that the level contents are shifted through successive stages up to and including the register stage whose comparison circuit detects that the contents of its register are identical to the input level. The simultaneity of operations performed by the different register stages is represented in FIG. 5 by the series of lines in blocks 512 through 522.

The above operation can be best illustrated by the following example. It is assumed that the input page level number value has the binary value "11101" and that the register stages of generator 10-6 contain the binary values shown in FIG. 4. This means that the page hit signal will enable the binary value "11101" to be loaded into stage register 10-600 and that the comparison circuit 10-602 and the AND gate 10-604 will enable the "11111" contents of stage register 10-600 to be loaded into stage register 10-620.

At the same time, the comparison circuit 10-622 and the AND gate 10-624 will enable the "11110" contents of stage register 10-620 to be loaded into stage register 10-640. However, the "11101" contents of stage register 10-640 will not be loaded into the register of the next stage, since the comparison circuit 10-642 will detect an identical comparison and the AND gate 10-644 will inhibit any further transfer of the shift enable signal. The generator 10-6 will provide as an output from stage register 10-920, a least recently used value stored in the level 0 register stage which corresponds to the binary value "00000".
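
The example can be replayed in software. The Python sketch below is an illustrative model only: `clock` is a hypothetical function name, the list `regs` is assumed to hold the 32 stage register contents with input stage register 10-600 first and output stage register 10-920 last, and the starting contents are the initialized values 11111 down to 00000.

```python
def clock(regs, level_in, hit):
    """Apply one clock signal and return the new stage register contents."""
    new = list(regs)
    enable = bool(hit)                 # page hit acts as the shift enable input
    for i, contents in enumerate(regs):
        if not enable:
            break                      # stages past the match are not loaded
        new[i] = level_in if i == 0 else regs[i - 1]
        enable = contents != level_in  # identical comparison inhibits the next stage
    return new


regs = list(range(31, -1, -1))         # initialized contents, input stage first
after = clock(regs, 0b11101, True)     # hit on level "11101"

print([format(v, "05b") for v in after[:4]])  # prints ['11101', '11111', '11110', '11100']
print(format(after[-1], "05b"))               # prints 00000
```

The first three stages change and the fourth keeps "11100", so the output stage continues to report level 0 as least recently used. With the hit signal at ZERO, the model returns the register contents unchanged.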

As seen from the above, the generator of the present invention provides an accurate and reliable way of generating values for indicating the order of usage of memory locations, such as in a cache memory on a least recently used basis. In operation, the generator can provide the required values essentially instantaneously. Also, it has the additional advantages of being simple and inexpensive to construct using modern technologies.

Many changes may be made to the preferred embodiment of the present invention. For example, the number and size of the register stages may be increased or decreased. To increase the speed of operation in the case of a large number of stages, appropriate carry lookahead circuits or other types of speed-up circuits may be included. It will be obvious to those skilled in the art that the generator of the present invention may be used with other types of memories or devices and, hence, with other types of interface circuits.

While in accordance with the provisions and statutes there has been illustrated and described the best form of the invention, certain changes may be made without departing from the spirit of the invention as set forth in the appended claims and that in some cases, certain features of the invention may be used to advantage without a corresponding use of other features.

What is claimed is:
 1. An apparatus for generating signals designating the least recently used level for use in a cache memory having a plurality of levels, each level containing a plurality of addressable locations, and said cache memory including circuits for generating a page hit signal indicative of hit being detected indicating that the data requested has been located in said cache memory and a page level number coded for indicating the level in which said hit occurred, said apparatus comprising: a number of register stages including an input stage and an output stage, each of said stages for storing a different replacement level value and being connected in tandem and each stage having only a clocked single multibit register for storing said different replacement level value; input means coupled to said cache memory for receiving said page hit signal and said page level value for indicating said level in which said hit condition occurred, said input means being enabled during normal operation to transfer said page level value and said page hit signal as replacement level value and shift enable signal inputs respectively to said input stage multibit register; means for simultaneously applying a single clock signal to each of said stage registers; each of said register stages except said output stage further including: a comparison circuit having an output and first and second sets of inputs respectively coupled to receive said different replacement value stored in said each stage multibit register and to receive from said input means, said page level number value; and, logic gating means having first and second inputs and an output, said first and second inputs being connected to receive said shift enable signal and to said output of said comparison circuit respectively and said output being connected to both a first input of a logic gating means and an input to a stage multibit register of a next adjacent stage, said comparison circuit of each stage generating an output signal for indicating when there is the absence of an identical comparison between said page level value and said different replacement value stored in said each stage multibit register for enabling said logic gating means to transfer and apply said shift enable signal as said first input to said logic gating means and to said multibit register of said next adjacent stage; said input means being inhibited by said absence of said page hit signal from applying said shift enable signal to said input stage thereby preventing any change in the stored different replacement level value contents of said number of stage multibit registers in response to said clock signal; and said input and the remaining ones of said number of stages of multibit registers in response to said clock signal and said shift enable signal applied from said outputs of said logic gating means, loading said page level value into said multibit register of said input stage and simultaneously shifting the stored different replacement level contents through successive stages of said multibit registers up to and including the stage whose comparison circuit detected said identical comparison and inhibited said logic gating means from transferring and applying said shift enable signal to said multibit register and to said first input of said logic gating means of said next adjacent stage and any remaining stages, thereby providing at said output stage register within a minimum of time following the transfer of said hit signal, a replacement value which accurately represents said least recently used replacement level within said cache memory.
 2. The apparatus of claim 1 wherein said gating means includes only an AND gate.
 3. The apparatus of claim 1 wherein said apparatus further includes initialization circuit means coupled to said input means, said initialization circuit means including: means for generating a predetermined sequence of replacement level values having the binary values 0 through n wherein n corresponds to said number of stages, said binary values 0 through n for establishing the relative priorities of said cache levels; and wherein said input means includes: first gating means for applying said sequence of replacement values to said input stage register; and, second gating means for continuously applying said shift enable signal to said input stage register, said stage registers being successively conditioned by said shift enable signal to load different ones of said replacement level values in said predetermined sequence in response to successive occurrences of said clock signal resulting in the storage of said values n through 0 in said registers of said input stage through said output stage.
 4. The apparatus of claim 3 wherein said means for generating includes a multibit binary counter having enable and clock input terminals, said counter in response to an increment signal applied to said enable input terminal to generate said predetermined sequence of said replacement values in response to said successive occurrences of said clock signal being applied to said clock input terminal.
 5. The apparatus of claim 3 wherein said cache memory further includes address circuits and said first gating means includes a selector circuit having first and second data inputs, a control input and an output coupled to said input register stage, said first and second data inputs being connected to said cache address circuits and to said means for generating respectively and said control input being connected to receive an initialization control signal, said selector circuit being conditioned by said control signal to apply signals from said means for generating to said output and in the absence of said control signal, said selector circuit being conditioned to apply signals corresponding to most recently used values from said cache address circuits to said output.
 6. The apparatus of claim 5 wherein said cache memory further includes cache control circuits coupled to said first gating means, said cache control circuits being operative to generate said control signal for indicating the initialization of said cache memory for establishing said priorities of said levels.
 7. The apparatus of claim 3 wherein said cache memory further includes cache address circuits and cache control circuits, said second gating means including a selector circuit having first and second data inputs, a control input and output coupled to said input register stage, said first data input being connected to said cache address circuits, said second data input and said control input being connected to receive an initialization control signal from said cache control circuits specifying an initialization operation, said selector circuit being conditioned by said control signal to continuously apply to said output, said shift enable signal and said cache control circuits being operative to inhibit the generation of said control signal at the completion of said initialization operation, said selector circuit being enabled by said cache control circuits to apply said signals indicating the occurrence of page hits from said cache address circuits to said output.
 8. A method of organizing a replacement level generator which accurately defines a least recently used replacement level value for use by a cache memory organized into n number of levels, each level including a plurality of addressable memory locations for storing information accessed in response to memory requests, said cache memory including circuits for generating each level containing a plurality of addressable locations, and said cache memory including circuits for generating a page hit signal indicative of hit being detected indicating that the data requested has been located in said cache memory and a page level number coded for indicating the level in which said hit occurred, said method comprising the steps of: (a) connecting in tandem, n number of stage registers including an input stage register and an output stage register representative of assignable levels n through 0; (b) applying said page hit signal as a shift enable signal to said input stage register and in tandem to the remaining registers; (c) initializing said number of stage registers to store replacement level values 0 through n respectively in the output stage 0 through input stage n for establishing the relative priorities by which the contents of storage locations within said levels of said memory are to be replaced; (d) simultaneously comparing said page number level value corresponding to the most recently used level with the level contents of each register except said output stage register; (e) generating an output signal indicating the result of such comparison within each stage except said output stage; (f) enabling for loading said input stage register and each stage register in succession for shifting in the absence of said output signal indicating an identical comparison by said shift enable signal only when said page hit signal designates that the information specified by a memory request resides in one of said levels; and, (g) loading input register stage with said page number most recently used value and simultaneously shifting by loading, the level contents of the input stage and remaining stages through successive stages up to and including the stage which last generated said output signal indicative of having detected an identical comparison, with the level contents of previous stages in response to a single clock signal so that said output stage now stores the least recently used replacement value level with a minimum amount of time based upon a transfer of said shift enable signal through said n number of storage register and the use of said information.
 9. The method of claim 8 wherein the operations of steps c through f are repeated for each new input level value in response to each occurrence of a clocking signal and said enabling of said input stage.